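In this work, we optimize the 3D trajectory of an unmanned aerial vehicle (UAV)-based portable access point (PAP) that provides wireless services to a set of ground nodes (GNs). Moreover, following the Peukert effect, we consider a practical non-linear discharge model for the UAV battery. We therefore formulate the problem in a novel way, as the maximization of a fairness-based energy efficiency metric that we call fair energy efficiency (FEE). The FEE metric defines a system that gives weight to both per-user service fairness and the energy efficiency of the PAP. The formulated problem is non-convex and has intractable constraints. To obtain a solution, we represent the problem as a Markov decision process (MDP) with continuous state and action spaces. Given the complexity of the solution space, we use the twin delayed deep deterministic policy gradient (TD3) actor-critic deep reinforcement learning (DRL) framework to learn a policy that maximizes the FEE of the system. We perform two types of RL training to demonstrate the effectiveness of our approach: the first (offline) approach keeps the positions of the GNs fixed throughout the training phase; the second approach generalizes the learned policy to any arrangement of GNs by changing the positions of the GNs after each training episode. Numerical evaluations show that neglecting the Peukert effect overestimates the air-time of the PAP, which can be addressed by optimally selecting the PAP's flying speed. Moreover, user fairness, energy efficiency, and hence the FEE value of the system can be improved by efficiently moving the PAP above the GNs. As a result, we observe FEE improvements over baseline scenarios of up to 88.31%, 272.34%, and 318.13% for suburban, urban, and dense urban environments, respectively.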
translated by 谷歌翻译
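The claim in the first abstract that neglecting the Peukert effect overestimates air-time can be illustrated with the textbook form of Peukert's law, t = H (C / (I H))^k. The sketch below is not the battery model used in that paper, which the abstract does not specify; the capacity, rated discharge time, current, and Peukert constant are hypothetical values chosen only for illustration.

```python
def ideal_runtime_hours(capacity_ah, current_a):
    """Linear discharge model: runtime is simply capacity divided by current."""
    return capacity_ah / current_a


def peukert_runtime_hours(capacity_ah, rated_hours, current_a, k):
    """Textbook Peukert's law: t = H * (C / (I * H)) ** k.

    capacity_ah -- rated capacity C, measured at the rated discharge time
    rated_hours -- rated discharge time H
    current_a   -- actual discharge current I
    k           -- Peukert constant (k = 1 recovers the linear model)
    """
    return rated_hours * (capacity_ah / (current_a * rated_hours)) ** k


# Hypothetical battery: 10 Ah rated over 1 hour, Peukert constant 1.1.
CAPACITY_AH = 10.0
RATED_HOURS = 1.0
PEUKERT_K = 1.1

# Drawing twice the rated current (for example, when flying faster).
HIGH_CURRENT_A = 20.0
ideal = ideal_runtime_hours(CAPACITY_AH, HIGH_CURRENT_A)
realistic = peukert_runtime_hours(CAPACITY_AH, RATED_HOURS, HIGH_CURRENT_A, PEUKERT_K)
overestimate_ratio = ideal / realistic  # > 1 means the linear model is optimistic
```

At the rated current both models agree; above it, the linear model predicts a longer runtime than Peukert's law, which is why the discharge current (and hence the flying speed) matters for air-time.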
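In this paper, we apply a multi-agent reinforcement learning (MARL) framework that allows the base station (BS) and the user equipments (UEs) to jointly learn a channel access policy and its signaling in a wireless multiple access scenario. In this framework, the BS and UEs are reinforcement learning (RL) agents that need to cooperate in order to deliver data. A comparison with contention-free and contention-based baselines shows that our framework achieves superior performance in terms of goodput, even in high-traffic situations, while maintaining a low collision rate. The scalability of the proposed method is studied, since it is a major problem in MARL, and this paper provides first results toward addressing it.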
In robust Markov decision processes (MDPs), the uncertainty in the transition kernel is addressed by finding a policy that optimizes the worst-case performance over an uncertainty set of MDPs. While much of the literature has focused on discounted MDPs, robust average-reward MDPs remain largely unexplored. In this paper, we focus on robust average-reward MDPs, where the goal is to find a policy that optimizes the worst-case average reward over an uncertainty set. We first take an approach that approximates average-reward MDPs using discounted MDPs. We prove that the robust discounted value function converges to the robust average-reward as the discount factor $\gamma$ goes to $1$, and moreover, when $\gamma$ is large, any optimal policy of the robust discounted MDP is also an optimal policy of the robust average-reward. We further design a robust dynamic programming approach, and theoretically characterize its convergence to the optimum. Then, we investigate robust average-reward MDPs directly without using discounted MDPs as an intermediate step. We derive the robust Bellman equation for robust average-reward MDPs, prove that the optimal policy can be derived from its solution, and further design a robust relative value iteration algorithm that provably finds its solution, or equivalently, the optimal robust policy.
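As a rough illustration of the idea behind relative value iteration with a worst-case inner step, the sketch below runs the update on a toy two-state problem. It is not the algorithm from the paper: the finite list of candidate transition distributions per state-action pair, the reference-state normalization, and all names (`robust_rvi`, `worst_case_value`, `REWARDS`, `KERNELS`) are assumptions made for this example.

```python
def worst_case_value(candidates, h):
    """Nature's move: the smallest expected next-state value over candidate kernels."""
    return min(sum(p * v for p, v in zip(dist, h)) for dist in candidates)


def robust_rvi(rewards, kernels, ref_state=0, tol=1e-10, max_iter=10_000):
    """Relative value iteration with a min over a finite uncertainty set.

    rewards[s][a] -- immediate reward
    kernels[s][a] -- list of candidate next-state distributions (assumed finite set)
    Returns (gain estimate, relative value function, greedy policy).
    """
    n_states = len(rewards)
    h = [0.0] * n_states
    gain = 0.0
    for _ in range(max_iter):
        # Robust Bellman backup: agent maximizes, nature minimizes.
        q = [
            [rewards[s][a] + worst_case_value(kernels[s][a], h)
             for a in range(len(rewards[s]))]
            for s in range(n_states)
        ]
        w = [max(row) for row in q]
        gain = w[ref_state]              # value at the reference state estimates the gain
        new_h = [v - gain for v in w]    # subtract it to keep the iterates bounded
        converged = max(abs(x - y) for x, y in zip(new_h, h)) < tol
        h = new_h
        if converged:
            break
    policy = [max(range(len(row)), key=row.__getitem__) for row in q]
    return gain, h, policy


# Toy problem: reward 1 in state 1, reward 0 in state 0, for both actions.
REWARDS = [[0.0, 0.0], [1.0, 1.0]]
# Action 0 has two candidate kernels (uncertain); action 1 has a single one.
ACTION_KERNELS = [[[0.2, 0.8], [0.5, 0.5]], [[0.9, 0.1]]]
KERNELS = [ACTION_KERNELS, ACTION_KERNELS]  # same candidates from both states

gain, h, policy = robust_rvi(REWARDS, KERNELS)
```

In this toy case the worst-case kernel for action 0 sends the chain to state 1 with probability 0.5, so the robust average reward of always playing action 0 is 0.5, which still beats action 1.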
Masked Image Modelling (MIM) has been shown to be an efficient self-supervised learning (SSL) pre-training paradigm when paired with transformer architectures and in the presence of a large amount of unlabelled natural images. The combination of the difficulties in accessing and obtaining large amounts of labeled data and the availability of unlabelled data in the medical imaging domain makes MIM an interesting approach to advance deep learning (DL) applications based on 3D medical imaging data. Nevertheless, SSL and, in particular, MIM applications with medical imaging data are rather scarce and there is still uncertainty around the potential of such a learning paradigm in the medical domain. We study MIM in the context of Prostate Cancer (PCa) lesion classification with T2 weighted (T2w) axial magnetic resonance imaging (MRI) data. In particular, we explore the effect of using MIM when coupled with convolutional neural networks (CNNs) under different conditions such as different masking strategies, obtaining better results in terms of AUC than other pre-training strategies like ImageNet weight initialization.
We introduce a machine-learning (ML)-based weather simulator--called "GraphCast"--which outperforms the most accurate deterministic operational medium-range weather forecasting system in the world, as well as all previous ML baselines. GraphCast is an autoregressive model, based on graph neural networks and a novel high-resolution multi-scale mesh representation, which we trained on historical weather data from the European Centre for Medium-Range Weather Forecasts (ECMWF)'s ERA5 reanalysis archive. It can make 10-day forecasts, at 6-hour time intervals, of five surface variables and six atmospheric variables, each at 37 vertical pressure levels, on a 0.25-degree latitude-longitude grid, which corresponds to roughly 25 x 25 kilometer resolution at the equator. Our results show GraphCast is more accurate than ECMWF's deterministic operational forecasting system, HRES, on 90.0% of the 2760 variable and lead time combinations we evaluated. GraphCast also outperforms the most accurate previous ML-based weather forecasting model on 99.2% of the 252 targets it reported. GraphCast can generate a 10-day forecast (35 gigabytes of data) in under 60 seconds on Cloud TPU v4 hardware. Unlike traditional forecasting methods, ML-based forecasting scales well with data: by training on bigger, higher quality, and more recent data, the skill of the forecasts can improve. Together these results represent a key step forward in complementing and improving weather modeling with ML, open new opportunities for fast, accurate forecasting, and help realize the promise of ML-based simulation in the physical sciences.
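The figures quoted in this abstract can be cross-checked with a little arithmetic. The sketch below is only a back-of-the-envelope check, not anything taken from the paper: it assumes a 721 x 1440 latitude-longitude grid (a 0.25-degree grid that includes both poles), 32-bit floats for storage, and a mean Earth radius of 6371 km, none of which are stated in the abstract.

```python
import math

# Quantities stated in the abstract.
SURFACE_VARS = 5
ATMOSPHERIC_VARS = 6
PRESSURE_LEVELS = 37
FORECAST_DAYS = 10
STEP_HOURS = 6
GRID_DEGREES = 0.25

# Assumptions not stated in the abstract.
N_LAT = int(180 / GRID_DEGREES) + 1   # 721 rows, assuming both poles are included
N_LON = int(360 / GRID_DEGREES)       # 1440 columns
BYTES_PER_VALUE = 4                   # assuming 32-bit floats
EARTH_RADIUS_KM = 6371.0              # mean Earth radius

# Values predicted at each grid point for one time step.
values_per_point = SURFACE_VARS + ATMOSPHERIC_VARS * PRESSURE_LEVELS

# Number of autoregressive steps in one 10-day forecast.
steps = FORECAST_DAYS * 24 // STEP_HOURS

grid_points = N_LAT * N_LON

# East-west width of one grid cell at the equator.
cell_km = 2 * math.pi * EARTH_RADIUS_KM * GRID_DEGREES / 360

# Raw size of one full forecast under the assumptions above.
forecast_bytes = steps * values_per_point * grid_points * BYTES_PER_VALUE
forecast_gib = forecast_bytes / 2**30
```

Under these assumptions a forecast holds 227 values per grid point over 40 steps, the equatorial cell width comes out a little under 28 km, and the raw output size lands close to the 35 gigabytes quoted in the abstract.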
Modern machine learning pipelines are limited due to data availability, storage quotas, privacy regulations, and expensive annotation processes. These constraints make it difficult or impossible to maintain a large-scale model trained on growing annotation sets. Continual learning directly approaches this problem, with the ultimate goal of devising methods where a neural network effectively learns relevant patterns for new (unseen) classes without significantly altering its performance on previously learned ones. In this paper, we address the problem of continual learning for video data. We introduce PIVOT, a novel method that leverages the extensive knowledge in pre-trained models from the image domain, thereby reducing the number of trainable parameters and the associated forgetting. Unlike previous methods, ours is the first approach that effectively uses prompting mechanisms for continual learning without any in-domain pre-training. Our experiments show that PIVOT improves state-of-the-art methods by a significant 27% on the 20-task ActivityNet setup.
Simulating rigid collisions among arbitrary shapes is notoriously difficult due to complex geometry and the strong non-linearity of the interactions. While graph neural network (GNN)-based models are effective at learning to simulate complex physical dynamics, such as fluids, cloth and articulated bodies, they have been less effective and efficient on rigid-body physics, except with very simple shapes. Existing methods that model collisions through the meshes' nodes are often inaccurate because they struggle when collisions occur on faces far from nodes. Alternative approaches that represent the geometry densely with many particles are prohibitively expensive for complex shapes. Here we introduce the Face Interaction Graph Network (FIGNet) which extends beyond GNN-based methods, and computes interactions between mesh faces, rather than nodes. Compared to learned node- and particle-based methods, FIGNet is around 4x more accurate in simulating complex shape interactions, while also 8x more computationally efficient on sparse, rigid meshes. Moreover, FIGNet can learn frictional dynamics directly from real-world data, and can be more accurate than analytical solvers given modest amounts of training data. FIGNet represents a key step forward in one of the few remaining physical domains which have seen little competition from learned simulators, and offers allied fields such as robotics, graphics and mechanical design a new tool for simulation and model-based planning.
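Probabilistic models based on continuous latent spaces, such as variational autoencoders, can be understood as uncountable mixture models whose components depend continuously on the latent code. They have proven to be expressive tools for generative and probabilistic modelling, but they are at odds with tractable probabilistic inference, that is, computing marginals and conditionals of the represented probability distribution. Meanwhile, tractable probabilistic models such as probabilistic circuits (PCs) can be understood as hierarchical discrete mixture models, which allows them to perform exact inference, but they often show subpar performance compared with continuous latent-space models. In this paper, we investigate a hybrid approach, namely continuous mixtures of tractable models with a small latent dimension. While these models are analytically intractable, they are well suited to numerical integration schemes based on a finite set of integration points. With a sufficiently large number of integration points, the approximation becomes effectively exact. Moreover, using a finite set of integration points, the approximation can be compiled into a PC that performs "exact inference in an approximate model". In experiments, we show that this simple scheme proves remarkably effective, as PCs learned this way set a new state of the art for tractable models on many standard density estimation benchmarks.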
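The core idea of replacing a continuous mixture by a finite set of integration points can be shown on a one-dimensional toy case. The sketch below is not the method or model from the paper: it assumes a standard normal latent z and a Gaussian component x | z ~ N(z, sigma^2), chosen because the exact marginal N(0, 1 + sigma^2) is known in closed form, and it uses a plain uniform grid rather than any particular quadrature rule. All function names are hypothetical.

```python
import math


def normal_pdf(x, mean, std):
    """Density of a univariate Gaussian."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def integration_points(n, z_max=8.0):
    """Uniform grid on [-z_max, z_max] with weights proportional to the N(0, 1) prior.

    The weights are normalized to sum to one, so the result is a valid
    finite (discrete) mixture with n components.
    """
    step = 2.0 * z_max / (n - 1)
    zs = [-z_max + i * step for i in range(n)]
    raw = [normal_pdf(z, 0.0, 1.0) for z in zs]
    total = sum(raw)
    return zs, [r / total for r in raw]


def mixture_density(x, zs, weights, sigma):
    """Finite-mixture approximation of p(x) = integral of p(x | z) p(z) dz."""
    return sum(w * normal_pdf(x, z, sigma) for z, w in zip(zs, weights))


def exact_density(x, sigma):
    """Closed-form marginal for this toy model: N(0, 1 + sigma^2)."""
    return normal_pdf(x, 0.0, math.sqrt(1.0 + sigma * sigma))


SIGMA = 0.5
coarse_zs, coarse_ws = integration_points(5)
fine_zs, fine_ws = integration_points(201)

coarse_error = abs(mixture_density(0.0, coarse_zs, coarse_ws, SIGMA) - exact_density(0.0, SIGMA))
fine_error = abs(mixture_density(0.0, fine_zs, fine_ws, SIGMA) - exact_density(0.0, SIGMA))
```

With only a handful of points the discrete mixture is a poor fit, while a few hundred points reproduce the exact marginal very closely; in both cases the approximation is itself an ordinary finite mixture, which is what makes exact inference in the approximate model possible.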
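Since panoptic segmentation provides a prediction for every pixel in the input, non-standard and unseen objects systematically lead to wrong outputs. However, in safety-critical settings, robustness against out-of-distribution samples and corner cases is crucial to avoid dangerous behavior, such as ignoring an animal or lost cargo on the road. Since driving datasets cannot contain enough data points to properly sample the long tail of the underlying distribution, methods must handle unknown and unseen scenarios in order to be deployed safely. Previous methods targeted part of this issue by re-identifying already seen but unlabeled objects. In this work, we broaden the scope by proposing holistic segmentation: a task to identify and separate unseen unknown objects into instances, without learning from unknowns, while performing panoptic segmentation of known classes. We tackle this new problem with U3HS, which first finds unknowns as highly uncertain regions and then clusters the corresponding instance-aware embeddings into individual objects. By doing so, for the first time in panoptic segmentation with unknown objects, our U3HS is not trained with unknown data, thus leaving the setting unconstrained with respect to object types and allowing for holistic scene understanding. Extensive experiments and comparisons on two public datasets, namely Cityscapes and Lost&Found as a transfer, demonstrate the effectiveness of U3HS in the challenging task of holistic segmentation, with competitive closed-set panoptic segmentation performance.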
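Recent research has shown that criminal networks have complex organizational structures, but whether this can be used to predict static and dynamic properties of criminal networks remains little explored. Here, by combining graph representation learning and machine learning methods, we show that structural properties of political corruption, police intelligence, and money laundering networks can be used to recover missing criminal partnerships, distinguish among different types of criminal and legal associations, and predict the total amount of money exchanged among criminal agents, all with outstanding accuracy. We also show that our approach can anticipate future criminal associations during the dynamic growth of corruption networks with significant accuracy. Thus, similar to evidence found at crime scenes, we conclude that structural patterns of criminal networks carry crucial information about illegal activities, which allows machine learning methods to predict missing information and even anticipate future criminal behavior.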